Overview

Dataset statistics

Number of variables: 23
Number of observations: 9527
Missing cells: 10496
Missing cells (%): 4.8%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 1.7 MiB
Average record size in memory: 184.0 B
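
The percentage and size figures above follow from the raw counts. A minimal sketch of that arithmetic, using only numbers copied from this block (the variable names are illustrative and not part of the report):

```python
# Figures copied from the "Dataset statistics" block of this report.
n_variables = 23
n_observations = 9527
missing_cells = 10496
avg_record_bytes = 184.0

# Every observation contributes one cell per variable.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Approximate total size implied by the average record size, in MiB.
total_mib = avg_record_bytes * n_observations / 2**20

print(f"total cells: {total_cells}")
print(f"missing cells: {missing_pct:.1f}%")
print(f"size in memory: {total_mib:.1f} MiB")
```

Both derived values round to the figures the report shows (4.8% and 1.7 MiB).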

Variable types

Categorical: 14
Numeric: 9

Warnings

ID has a high cardinality: 9527 distinct values High cardinality
Application_Receipt_Date has a high cardinality: 357 distinct values High cardinality
Applicant_BirthDate has a high cardinality: 5836 distinct values High cardinality
Manager_DOJ has a high cardinality: 646 distinct values High cardinality
Manager_DoB has a high cardinality: 1245 distinct values High cardinality
Office_PIN is highly correlated with Applicant_City_PIN High correlation
Applicant_City_PIN is highly correlated with Office_PIN High correlation
Manager_Num_Application is highly correlated with Manager_Num_Coded High correlation
Manager_Num_Coded is highly correlated with Manager_Num_Application High correlation
Manager_Business is highly correlated with Manager_Num_Products and 2 other fields High correlation
Manager_Num_Products is highly correlated with Manager_Business and 2 other fields High correlation
Manager_Business2 is highly correlated with Manager_Business and 2 other fields High correlation
Manager_Num_Products2 is highly correlated with Manager_Business and 2 other fields High correlation
Manager_Business2 is highly correlated with Manager_Business and 3 other fields High correlation
Manager_Business is highly correlated with Manager_Business2 and 3 other fields High correlation
Manager_Grade is highly correlated with Manager_Joining_Designation and 1 other field High correlation
Manager_Joining_Designation is highly correlated with Manager_Grade and 2 other fields High correlation
Manager_Num_Products is highly correlated with Manager_Business2 and 3 other fields High correlation
Manager_Current_Designation is highly correlated with Manager_Grade and 1 other field High correlation
Manager_Status is highly correlated with Manager_Business2 and 3 other fields High correlation
Applicant_City_PIN is highly correlated with Office_PIN High correlation
Manager_Num_Products2 is highly correlated with Manager_Business2 and 2 other fields High correlation
Office_PIN is highly correlated with Applicant_City_PIN High correlation
Manager_Joining_Designation is highly correlated with Manager_Current_Designation High correlation
Manager_Current_Designation is highly correlated with Manager_Joining_Designation High correlation
Applicant_City_PIN has 97 (1.0%) missing values Missing
Applicant_Occupation has 1221 (12.8%) missing values Missing
Manager_DOJ has 683 (7.2%) missing values Missing
Manager_Joining_Designation has 683 (7.2%) missing values Missing
Manager_Current_Designation has 683 (7.2%) missing values Missing
Manager_Grade has 683 (7.2%) missing values Missing
Manager_Status has 683 (7.2%) missing values Missing
Manager_Gender has 683 (7.2%) missing values Missing
Manager_DoB has 683 (7.2%) missing values Missing
Manager_Num_Application has 683 (7.2%) missing values Missing
Manager_Num_Coded has 683 (7.2%) missing values Missing
Manager_Business has 683 (7.2%) missing values Missing
Manager_Num_Products has 683 (7.2%) missing values Missing
Manager_Business2 has 683 (7.2%) missing values Missing
Manager_Num_Products2 has 683 (7.2%) missing values Missing
ID is uniformly distributed Uniform
Applicant_BirthDate is uniformly distributed Uniform
ID has unique values Unique
Manager_Num_Application has 2980 (31.3%) zeros Zeros
Manager_Num_Coded has 5283 (55.5%) zeros Zeros
Manager_Business has 2904 (30.5%) zeros Zeros
Manager_Num_Products has 2909 (30.5%) zeros Zeros
Manager_Business2 has 2909 (30.5%) zeros Zeros
Manager_Num_Products2 has 2914 (30.6%) zeros Zeros
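
The per-variable missing counts can be reconciled with the 10496 missing cells reported in the dataset statistics. The sketch below is a consistency check, not part of the report: the counts for the variables without a Missing warning are taken from their variable sections, and the two dictionary keys marked "unnamed" are placeholder labels for blocks whose variable name does not appear in this text.

```python
# Variables that carry a "Missing" warning with their own count.
missing_with_warning = {
    "Applicant_City_PIN": 97,
    "Applicant_Occupation": 1221,
}

# Thirteen Manager_* variables are each listed with 683 missing values.
manager_columns_missing = 13
manager_missing_each = 683

# Variables with missing values but no warning; counts come from the
# variable sections. The "unnamed" keys are placeholders, not real names.
missing_without_warning = {
    "Applicant_Gender": 67,
    "Applicant_BirthDate": 73,
    "unnamed block with values M/S/W/D": 73,
    "unnamed block with qualification values": 86,
}

total_missing = (
    sum(missing_with_warning.values())
    + manager_columns_missing * manager_missing_each
    + sum(missing_without_warning.values())
)
print(total_missing)
```

The total equals the "Missing cells" figure, which suggests that no other variable contributes missing values.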

Reproduction

Analysis started: 2021-07-27 05:43:48.977680
Analysis finished: 2021-07-27 05:44:17.593113
Duration: 28.62 seconds
Software version: pandas-profiling v3.0.0
Download configuration: config.json
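
The reported duration is the difference between the two timestamps. A small sketch with the standard library, assuming both timestamps are in the same time zone:

```python
from datetime import datetime

# Timestamps copied from the "Reproduction" block.
started = datetime.fromisoformat("2021-07-27 05:43:48.977680")
finished = datetime.fromisoformat("2021-07-27 05:44:17.593113")

duration_seconds = (finished - started).total_seconds()
print(f"{duration_seconds:.2f} seconds")
```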

Variables

ID
Categorical

HIGH CARDINALITY
UNIFORM
UNIQUE

Distinct: 9527
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 74.6 KiB

Value                Count
FIN1000297           1
FIN1008090           1
FIN1006786           1
FIN1003695           1
FIN1001949           1
Other values (9522)  9522

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Characters and Unicode

Total characters: 95270
Distinct characters: 13
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
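
As a rough illustration of those character properties, the Python standard library exposes the general category of each code point. The sketch below classifies one sample ID from this report; it is not how the report itself computes its tables, and it covers categories only, since the standard library does not expose scripts or blocks.

```python
import unicodedata
from collections import Counter

# Sample ID taken from the "Sample" rows of this report.
sample_id = "FIN1000001"

# unicodedata.category returns two-letter codes:
# "Lu" = Uppercase Letter, "Nd" = Decimal Number.
category_counts = Counter(unicodedata.category(ch) for ch in sample_id)
print(category_counts)
```

Three of ten characters are uppercase letters and seven are decimal digits, which matches the 30.0% / 70.0% split reported for the whole column.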

Unique

Unique: 9527
Unique (%): 100.0%

Sample

1st row: FIN1000001
2nd row: FIN1000002
3rd row: FIN1000003
4th row: FIN1000004
5th row: FIN1000005

Common Values

Value                Count  Frequency (%)
FIN1000297           1      < 0.1%
FIN1008090           1      < 0.1%
FIN1006786           1      < 0.1%
FIN1003695           1      < 0.1%
FIN1001949           1      < 0.1%
FIN1005284           1      < 0.1%
FIN1000520           1      < 0.1%
FIN1008952           1      < 0.1%
FIN1004152           1      < 0.1%
FIN1003429           1      < 0.1%
Other values (9517)  9517   99.9%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
fin10058421
 
< 0.1%
fin10088631
 
< 0.1%
fin10083431
 
< 0.1%
fin10022991
 
< 0.1%
fin10021381
 
< 0.1%
fin10018091
 
< 0.1%
fin10065531
 
< 0.1%
fin10032371
 
< 0.1%
fin10034081
 
< 0.1%
fin10078311
 
< 0.1%
Other values (9517)9517
99.9%

Most occurring characters

ValueCountFrequency (%)
022963
24.1%
113440
14.1%
F9527
10.0%
I9527
10.0%
N9527
10.0%
23911
 
4.1%
33903
 
4.1%
43903
 
4.1%
53831
 
4.0%
63803
 
4.0%
Other values (3)10935
11.5%

Most occurring categories

ValueCountFrequency (%)
Decimal Number66689
70.0%
Uppercase Letter28581
30.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
022963
34.4%
113440
20.2%
23911
 
5.9%
33903
 
5.9%
43903
 
5.9%
53831
 
5.7%
63803
 
5.7%
73803
 
5.7%
83802
 
5.7%
93330
 
5.0%
Uppercase Letter
ValueCountFrequency (%)
F9527
33.3%
I9527
33.3%
N9527
33.3%

Most occurring scripts

ValueCountFrequency (%)
Common66689
70.0%
Latin28581
30.0%

Most frequent character per script

Common
ValueCountFrequency (%)
022963
34.4%
113440
20.2%
23911
 
5.9%
33903
 
5.9%
43903
 
5.9%
53831
 
5.7%
63803
 
5.7%
73803
 
5.7%
83802
 
5.7%
93330
 
5.0%
Latin
ValueCountFrequency (%)
F9527
33.3%
I9527
33.3%
N9527
33.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII95270
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
022963
24.1%
113440
14.1%
F9527
10.0%
I9527
10.0%
N9527
10.0%
23911
 
4.1%
33903
 
4.1%
43903
 
4.1%
53831
 
4.0%
63803
 
4.0%
Other values (3)10935
11.5%

Office_PIN
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 98
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 452894.3722
Minimum: 110005
Maximum: 851101
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 74.6 KiB

Quantile statistics

Minimum: 110005
5-th percentile: 122002
Q1: 226001
median: 416001
Q3: 695014
95-th percentile: 841428
Maximum: 851101
Range: 741096
Interquartile range (IQR): 469013

Descriptive statistics

Standard deviation: 235690.6183
Coefficient of variation (CV): 0.5204096865
Kurtosis: -1.28405354
Mean: 452894.3722
Median Absolute Deviation (MAD): 194991
Skewness: 0.3027009114
Sum: 4314724684
Variance: 5.555006753 × 10^10
Monotonicity: Not monotonic
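
Several of these statistics are simple functions of the others. The following sketch recomputes them from the figures printed for Office_PIN, as a sanity check rather than a description of the profiler's internals (variable names are illustrative):

```python
# Figures copied from the Office_PIN statistics above.
minimum, maximum = 110005, 851101
q1, q3 = 226001, 695014
mean = 452894.3722
std_dev = 235690.6183
n_observations = 9527  # Office_PIN has no missing values

value_range = maximum - minimum   # Range
iqr = q3 - q1                     # Interquartile range
cv = std_dev / mean               # Coefficient of variation
variance = std_dev ** 2           # Variance is the squared standard deviation
total = mean * n_observations     # Sum, up to rounding of the printed mean

print(value_range, iqr)
print(f"CV = {cv:.4f}, variance = {variance:.4e}, sum = {total:.0f}")
```

Note that Office_PIN holds PIN codes, so these summary statistics describe the code values and not a measured quantity.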
Histogram with fixed size bins (bins=50)
Value              Count  Frequency (%)
695014             397    4.2%
211001             257    2.7%
221010             249    2.6%
121002             236    2.5%
400075             216    2.3%
700016             192    2.0%
444601             187    2.0%
201301             184    1.9%
208001             180    1.9%
226001             176    1.8%
Other values (88)  7253   76.1%

Value   Count  Frequency (%)
110005  146    1.5%
110034  3      < 0.1%
121002  236    2.5%
122002  98     1.0%
124001  10     0.1%
125001  78     0.8%
141001  78     0.8%
143001  12     0.1%
144001  2      < 0.1%
160017  66     0.7%

Value   Count  Frequency (%)
851101  104    1.1%
848101  80     0.8%
843302  110    1.2%
842001  145    1.5%
841428  99     1.0%
841226  66     0.7%
834001  120    1.3%
826001  6      0.1%
824101  71     0.7%
814112  84     0.9%

Application_Receipt_Date
Categorical

HIGH CARDINALITY

Distinct: 357
Distinct (%): 3.7%
Missing: 0
Missing (%): 0.0%
Memory size: 74.6 KiB

Value               Count
5/9/2007            165
5/8/2007            97
4/18/2007           86
5/7/2007            86
1/2/2008            85
Other values (352)  9008

Length

Max length: 10
Median length: 9
Mean length: 8.931562926
Min length: 8

Characters and Unicode

Total characters: 85091
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 2
Unique (%): < 0.1%

Sample

1st row: 4/16/2007
2nd row: 4/16/2007
3rd row: 4/16/2007
4th row: 4/16/2007
5th row: 4/16/2007
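
This variable is typed as Categorical because its values are stored as text, but they look like calendar dates. A minimal sketch of parsing them with the standard library follows; the month/day/year order is an assumption inferred from values such as 4/16/2007, where 16 can only be a day.

```python
from datetime import datetime

# Strings as they appear in this report (no zero padding).
raw_dates = ["4/16/2007", "5/9/2007", "11/12/2007"]

# "%m/%d/%Y" is an assumed format, see the note above.
parsed = [datetime.strptime(s, "%m/%d/%Y").date() for s in raw_dates]

# String lengths vary with the number of digits in month and day.
lengths = [len(s) for s in raw_dates]

for raw, date, length in zip(raw_dates, parsed, lengths):
    print(raw, "->", date.isoformat(), f"(length {length})")
```

The varying lengths (8 to 10 characters) correspond to the minimum and maximum lengths reported for this variable.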

Common Values

Value               Count  Frequency (%)
5/9/2007            165    1.7%
5/8/2007            97     1.0%
4/18/2007           86     0.9%
5/7/2007            86     0.9%
1/2/2008            85     0.9%
11/12/2007          83     0.9%
5/5/2008            80     0.8%
4/16/2007           79     0.8%
12/6/2007           73     0.8%
11/19/2007          71     0.7%
Other values (347)  8622   90.5%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
5/9/2007165
 
1.7%
5/8/200797
 
1.0%
4/18/200786
 
0.9%
5/7/200786
 
0.9%
1/2/200885
 
0.9%
11/12/200783
 
0.9%
5/5/200880
 
0.8%
4/16/200779
 
0.8%
12/6/200773
 
0.8%
11/19/200771
 
0.7%
Other values (347)8622
90.5%

Most occurring characters

ValueCountFrequency (%)
020475
24.1%
/19054
22.4%
214378
16.9%
18431
9.9%
78009
 
9.4%
84900
 
5.8%
52841
 
3.3%
62221
 
2.6%
41863
 
2.2%
91585
 
1.9%

Most occurring categories

ValueCountFrequency (%)
Decimal Number66037
77.6%
Other Punctuation19054
 
22.4%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
020475
31.0%
214378
21.8%
18431
12.8%
78009
 
12.1%
84900
 
7.4%
52841
 
4.3%
62221
 
3.4%
41863
 
2.8%
91585
 
2.4%
31334
 
2.0%
Other Punctuation
ValueCountFrequency (%)
/19054
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common85091
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
020475
24.1%
/19054
22.4%
214378
16.9%
18431
9.9%
78009
 
9.4%
84900
 
5.8%
52841
 
3.3%
62221
 
2.6%
41863
 
2.2%
91585
 
1.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII85091
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
020475
24.1%
/19054
22.4%
214378
16.9%
18431
9.9%
78009
 
9.4%
84900
 
5.8%
52841
 
3.3%
62221
 
2.6%
41863
 
2.2%
91585
 
1.9%

Applicant_City_PIN
Real number (ℝ≥0)

HIGH CORRELATION
MISSING

Distinct: 2979
Distinct (%): 31.6%
Missing: 97
Missing (%): 1.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 456784.5473
Minimum: 110001
Maximum: 995657
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 74.6 KiB

Quantile statistics

Minimum: 110001
5-th percentile: 121102
Q1: 226020
median: 422005.5
Q3: 695017
95-th percentile: 843121
Maximum: 995657
Range: 885656
Interquartile range (IQR): 468997

Descriptive statistics

Standard deviation: 239291.0812
Coefficient of variation (CV): 0.5238598429
Kurtosis: -1.302812578
Mean: 456784.5473
Median Absolute Deviation (MAD): 209497
Skewness: 0.2739133409
Sum: 4307478281
Variance: 5.726022155 × 10^10
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                Count  Frequency (%)
202001               103    1.1%
492001               75     0.8%
305001               64     0.7%
452001               55     0.6%
476001               51     0.5%
281001               49     0.5%
125001               48     0.5%
285001               47     0.5%
803101               46     0.5%
274001               46     0.5%
Other values (2969)  8846   92.9%
(Missing)            97     1.0%

Value   Count  Frequency (%)
110001  2      < 0.1%
110003  3      < 0.1%
110004  2      < 0.1%
110005  2      < 0.1%
110006  4      < 0.1%
110007  4      < 0.1%
110008  1      < 0.1%
110009  2      < 0.1%
110010  1      < 0.1%
110014  4      < 0.1%

Value   Count  Frequency (%)
995657  1      < 0.1%
888620  1      < 0.1%
856127  1      < 0.1%
854101  1      < 0.1%
853204  11     0.1%
853202  3      < 0.1%
853201  7      0.1%
853102  1      < 0.1%
852111  1      < 0.1%
851231  1      < 0.1%

Applicant_Gender
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 67
Missing (%): 0.7%
Memory size: 74.6 KiB

Value  Count
M      7179
F      2281

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 9460
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: M
2nd row: M
3rd row: M
4th row: M
5th row: M

Common Values

Value      Count  Frequency (%)
M          7179   75.4%
F          2281   23.9%
(Missing)  67     0.7%
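
The frequencies in this table are shares of all 9527 rows, missing values included, whereas the word and character tables further down use only the 9460 non-missing values as the denominator. A short sketch of both calculations, using the counts above (names are illustrative):

```python
counts = {"M": 7179, "F": 2281}
missing = 67

n_non_missing = sum(counts.values())
n_rows = n_non_missing + missing

# Share of all rows (as in "Common Values").
share_of_all_rows = {k: round(100 * v / n_rows, 1) for k, v in counts.items()}

# Share of non-missing values (as in the word and character tables).
share_of_non_missing = {
    k: round(100 * v / n_non_missing, 1) for k, v in counts.items()
}

print(share_of_all_rows)
print(share_of_non_missing)
```

This explains why M appears as 75.4% in one table and 75.9% in another.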

Length

Histogram of lengths of the category

Pie chart

ValueCountFrequency (%)
m7179
75.9%
f2281
 
24.1%

Most occurring characters

ValueCountFrequency (%)
M7179
75.9%
F2281
 
24.1%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter9460
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
M7179
75.9%
F2281
 
24.1%

Most occurring scripts

ValueCountFrequency (%)
Latin9460
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
M7179
75.9%
F2281
 
24.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII9460
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
M7179
75.9%
F2281
 
24.1%

Applicant_BirthDate
Categorical

HIGH CARDINALITY
UNIFORM

Distinct: 5836
Distinct (%): 61.7%
Missing: 73
Missing (%): 0.8%
Memory size: 74.6 KiB

Value                Count
1/3/1978             24
1/3/1980             20
1/2/1977             18
1/3/1979             16
1/3/1968             13
Other values (5831)  9363

Length

Max length: 10
Median length: 9
Mean length: 8.806854242
Min length: 8

Characters and Unicode

Total characters: 83260
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 3738
Unique (%): 39.5%

Sample

1st row: 12/19/1971
2nd row: 2/17/1983
3rd row: 1/16/1966
4th row: 2/3/1988
5th row: 7/4/1985

Common Values

Value                Count  Frequency (%)
1/3/1978             24     0.3%
1/3/1980             20     0.2%
1/2/1977             18     0.2%
1/3/1979             16     0.2%
1/3/1968             13     0.1%
1/3/1983             13     0.1%
1/3/1976             13     0.1%
1/2/1981             11     0.1%
7/2/1980             10     0.1%
1/3/1984             10     0.1%
Other values (5826)  9306   97.7%
(Missing)            73     0.8%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
1/3/197824
 
0.3%
1/3/198020
 
0.2%
1/2/197718
 
0.2%
1/3/197916
 
0.2%
1/3/198313
 
0.1%
1/3/196813
 
0.1%
1/3/197613
 
0.1%
1/2/198111
 
0.1%
7/3/197710
 
0.1%
1/2/197310
 
0.1%
Other values (5826)9306
98.4%

Most occurring characters

ValueCountFrequency (%)
/18908
22.7%
118022
21.6%
911626
14.0%
76754
 
8.1%
86238
 
7.5%
26091
 
7.3%
64331
 
5.2%
33302
 
4.0%
53088
 
3.7%
42742
 
3.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number64352
77.3%
Other Punctuation18908
 
22.7%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
118022
28.0%
911626
18.1%
76754
 
10.5%
86238
 
9.7%
26091
 
9.5%
64331
 
6.7%
33302
 
5.1%
53088
 
4.8%
42742
 
4.3%
02158
 
3.4%
Other Punctuation
ValueCountFrequency (%)
/18908
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common83260
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
/18908
22.7%
118022
21.6%
911626
14.0%
76754
 
8.1%
86238
 
7.5%
26091
 
7.3%
64331
 
5.2%
33302
 
4.0%
53088
 
3.7%
42742
 
3.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII83260
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
/18908
22.7%
118022
21.6%
911626
14.0%
76754
 
8.1%
86238
 
7.5%
26091
 
7.3%
64331
 
5.2%
33302
 
4.0%
53088
 
3.7%
42742
 
3.3%

Distinct: 4
Distinct (%): < 0.1%
Missing: 73
Missing (%): 0.8%
Memory size: 74.6 KiB

Value  Count
M      6177
S      3267
W      6
D      4

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 9454
Distinct characters: 4
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: M
2nd row: S
3rd row: M
4th row: S
5th row: M

Common Values

Value      Count  Frequency (%)
M          6177   64.8%
S          3267   34.3%
W          6      0.1%
D          4      < 0.1%
(Missing)  73     0.8%

Length

Histogram of lengths of the category

Pie chart

ValueCountFrequency (%)
m6177
65.3%
s3267
34.6%
w6
 
0.1%
d4
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
M6177
65.3%
S3267
34.6%
W6
 
0.1%
D4
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter9454
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
M6177
65.3%
S3267
34.6%
W6
 
0.1%
D4
 
< 0.1%

Most occurring scripts

ValueCountFrequency (%)
Latin9454
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
M6177
65.3%
S3267
34.6%
W6
 
0.1%
D4
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII9454
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
M6177
65.3%
S3267
34.6%
W6
 
0.1%
D4
 
< 0.1%

Applicant_Occupation
Categorical

MISSING

Distinct: 5
Distinct (%): 0.1%
Missing: 1221
Missing (%): 12.8%
Memory size: 74.6 KiB

Value          Count
Salaried       3787
Business       2303
Others         1966
Self Employed  149
Student        101

Length

Max length: 13
Median length: 8
Mean length: 7.604141584
Min length: 6

Characters and Unicode

Total characters: 63160
Distinct characters: 21
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Others
2nd row: Others
3rd row: Business
4th row: Salaried
5th row: Others

Common Values

Value          Count  Frequency (%)
Salaried       3787   39.8%
Business       2303   24.2%
Others         1966   20.6%
Self Employed  149    1.6%
Student        101    1.1%
(Missing)      1221   12.8%

Length

Histogram of lengths of the category

Pie chart

Value     Count  Frequency (%)
salaried  3787   44.8%
business  2303   27.2%
others    1966   23.3%
self      149    1.8%
employed  149    1.8%
student   101    1.2%
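
This word table has six entries for five categories because "Self Employed" contributes two words. The sketch below reproduces the counts under the assumption that values are lowercased and split on whitespace; the profiler's actual tokenisation is not shown in this report.

```python
from collections import Counter

# Category counts copied from "Common Values" (missing rows excluded).
category_counts = {
    "Salaried": 3787,
    "Business": 2303,
    "Others": 1966,
    "Self Employed": 149,
    "Student": 101,
}

# Assumed tokenisation: lowercase, then split on whitespace.
word_counts = Counter()
for label, count in category_counts.items():
    for word in label.lower().split():
        word_counts[word] += count

total_words = sum(word_counts.values())
word_share = {w: round(100 * c / total_words, 1) for w, c in word_counts.items()}

for word, count in word_counts.most_common():
    print(f"{word:<10}{count:>6}{word_share[word]:>7}%")
```

The denominator is the total number of words (8455), which is why the percentages differ from those in "Common Values".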

Most occurring characters

ValueCountFrequency (%)
s8875
14.1%
e8455
13.4%
a7574
12.0%
i6090
9.6%
r5753
9.1%
l4085
6.5%
S4037
6.4%
d4037
6.4%
u2404
 
3.8%
n2404
 
3.8%
Other values (11)9446
15.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter54556
86.4%
Uppercase Letter8455
 
13.4%
Space Separator149
 
0.2%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
s8875
16.3%
e8455
15.5%
a7574
13.9%
i6090
11.2%
r5753
10.5%
l4085
7.5%
d4037
7.4%
u2404
 
4.4%
n2404
 
4.4%
t2168
 
4.0%
Other values (6)2711
 
5.0%
Uppercase Letter
ValueCountFrequency (%)
S4037
47.7%
B2303
27.2%
O1966
23.3%
E149
 
1.8%
Space Separator
ValueCountFrequency (%)
149
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin63011
99.8%
Common149
 
0.2%

Most frequent character per script

Latin
ValueCountFrequency (%)
s8875
14.1%
e8455
13.4%
a7574
12.0%
i6090
9.7%
r5753
9.1%
l4085
6.5%
S4037
6.4%
d4037
6.4%
u2404
 
3.8%
n2404
 
3.8%
Other values (10)9297
14.8%
Common
ValueCountFrequency (%)
149
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII63160
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
s8875
14.1%
e8455
13.4%
a7574
12.0%
i6090
9.6%
r5753
9.1%
l4085
6.5%
S4037
6.4%
d4037
6.4%
u2404
 
3.8%
n2404
 
3.8%
Other values (11)9446
15.0%

Distinct: 11
Distinct (%): 0.1%
Missing: 86
Missing (%): 0.9%
Memory size: 74.6 KiB

Value                               Count
Class XII                           5806
Graduate                            3196
Class X                             225
Others                              132
Masters of Business Administration  74
Other values (6)                    8

Length

Max length: 64
Median length: 9
Mean length: 8.806694206
Min length: 6

Characters and Unicode

Total characters: 83144
Distinct characters: 34
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 5
Unique (%): 0.1%

Sample

1st row: Graduate
2nd row: Class XII
3rd row: Class XII
4th row: Class XII
5th row: Class XII

Common Values

Value                                                             Count  Frequency (%)
Class XII                                                         5806   60.9%
Graduate                                                          3196   33.5%
Class X                                                           225    2.4%
Others                                                            132    1.4%
Masters of Business Administration                                74     0.8%
Associate / Fellow of Institute of Chartered Accountans of India  3      < 0.1%
Professional Qualification in Marketing                           1      < 0.1%
Associate/Fellow of Institute of Company Secretories of India     1      < 0.1%
Associate/Fellow of Insurance Institute of India                  1      < 0.1%
Certified Associateship of Indian Institute of Bankers            1      < 0.1%
(Missing)                                                         86     0.9%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
class6031
38.3%
xii5806
36.9%
graduate3196
20.3%
x225
 
1.4%
others132
 
0.8%
of92
 
0.6%
administration74
 
0.5%
business74
 
0.5%
masters74
 
0.5%
institute6
 
< 0.1%
Other values (20)37
 
0.2%

Most occurring characters

ValueCountFrequency (%)
s12667
15.2%
a12599
15.2%
I11626
14.0%
6306
7.6%
l6046
7.3%
C6036
7.3%
X6031
7.3%
t3587
 
4.3%
e3511
 
4.2%
r3490
 
4.2%
Other values (24)11245
13.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter49566
59.6%
Uppercase Letter27266
32.8%
Space Separator6306
 
7.6%
Other Punctuation6
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
s12667
25.6%
a12599
25.4%
l6046
12.2%
t3587
 
7.2%
e3511
 
7.1%
r3490
 
7.0%
u3282
 
6.6%
d3281
 
6.6%
i328
 
0.7%
n250
 
0.5%
Other values (10)525
 
1.1%
Uppercase Letter
ValueCountFrequency (%)
I11626
42.6%
C6036
22.1%
X6031
22.1%
G3196
 
11.7%
O132
 
0.5%
A85
 
0.3%
M75
 
0.3%
B75
 
0.3%
F6
 
< 0.1%
S2
 
< 0.1%
Other values (2)2
 
< 0.1%
Space Separator
ValueCountFrequency (%)
6306
100.0%
Other Punctuation
ValueCountFrequency (%)
/6
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin76832
92.4%
Common6312
 
7.6%

Most frequent character per script

Latin
ValueCountFrequency (%)
s12667
16.5%
a12599
16.4%
I11626
15.1%
l6046
7.9%
C6036
7.9%
X6031
7.8%
t3587
 
4.7%
e3511
 
4.6%
r3490
 
4.5%
u3282
 
4.3%
Other values (22)7957
10.4%
Common
ValueCountFrequency (%)
6306
99.9%
/6
 
0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII83144
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
s12667
15.2%
a12599
15.2%
I11626
14.0%
6306
7.6%
l6046
7.3%
C6036
7.3%
X6031
7.3%
t3587
 
4.3%
e3511
 
4.2%
r3490
 
4.2%
Other values (24)11245
13.5%

Manager_DOJ
Categorical

HIGH CARDINALITY
MISSING

Distinct: 646
Distinct (%): 7.3%
Missing: 683
Missing (%): 7.2%
Memory size: 74.6 KiB

Value               Count
7/9/2007            106
6/11/2007           76
11/6/2007           71
5/8/2006            69
12/3/2007           67
Other values (641)  8455

Length

Max length: 10
Median length: 9
Mean length: 8.968905473
Min length: 8

Characters and Unicode

Total characters: 79321
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 49
Unique (%): 0.6%

Sample

1st row: 11/10/2005
2nd row: 11/10/2005
3rd row: 5/27/2006
4th row: 8/21/2003
5th row: 5/8/2006

Common Values

Value               Count  Frequency (%)
7/9/2007            106    1.1%
6/11/2007           76     0.8%
11/6/2007           71     0.7%
5/8/2006            69     0.7%
12/3/2007           67     0.7%
7/28/2003           67     0.7%
5/11/2006           67     0.7%
6/25/2007           64     0.7%
8/20/2007           63     0.7%
7/3/2007            54     0.6%
Other values (636)  8140   85.4%
(Missing)           683    7.2%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
7/9/2007106
 
1.2%
6/11/200776
 
0.9%
11/6/200771
 
0.8%
5/8/200669
 
0.8%
5/11/200667
 
0.8%
7/28/200367
 
0.8%
12/3/200767
 
0.8%
6/25/200764
 
0.7%
8/20/200763
 
0.7%
7/3/200754
 
0.6%
Other values (636)8140
92.0%

Most occurring characters

ValueCountFrequency (%)
019423
24.5%
/17688
22.3%
213727
17.3%
18374
10.6%
75223
 
6.6%
63936
 
5.0%
52595
 
3.3%
82518
 
3.2%
32485
 
3.1%
41855
 
2.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number61633
77.7%
Other Punctuation17688
 
22.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
019423
31.5%
213727
22.3%
18374
13.6%
75223
 
8.5%
63936
 
6.4%
52595
 
4.2%
82518
 
4.1%
32485
 
4.0%
41855
 
3.0%
91497
 
2.4%
Other Punctuation
ValueCountFrequency (%)
/17688
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common79321
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
019423
24.5%
/17688
22.3%
213727
17.3%
18374
10.6%
75223
 
6.6%
63936
 
5.0%
52595
 
3.3%
82518
 
3.2%
32485
 
3.1%
41855
 
2.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII79321
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
019423
24.5%
/17688
22.3%
213727
17.3%
18374
10.6%
75223
 
6.6%
63936
 
5.0%
52595
 
3.3%
82518
 
3.2%
32485
 
3.1%
41855
 
2.3%

Manager_Joining_Designation
Categorical

HIGH CORRELATION
MISSING

Distinct: 8
Distinct (%): 0.1%
Missing: 683
Missing (%): 7.2%
Memory size: 74.6 KiB

Value             Count
Level 1           4632
Level 2           2787
Level 3           1146
Level 4           200
Other             58
Other values (3)  21

Length

Max length: 7
Median length: 7
Mean length: 6.986883763
Min length: 5

Characters and Unicode

Total characters: 61792
Distinct characters: 16
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: Level 1
2nd row: Level 1
3rd row: Level 1
4th row: Level 1
5th row: Level 1

Common Values

Value      Count  Frequency (%)
Level 1    4632   48.6%
Level 2    2787   29.3%
Level 3    1146   12.0%
Level 4    200    2.1%
Other      58     0.6%
Level 6    18     0.2%
Level 7    2      < 0.1%
Level 5    1      < 0.1%
(Missing)  683    7.2%

Length

Histogram of lengths of the category

Pie chart

ValueCountFrequency (%)
level8786
49.8%
14632
26.3%
22787
 
15.8%
31146
 
6.5%
4200
 
1.1%
other58
 
0.3%
618
 
0.1%
72
 
< 0.1%
51
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
e17630
28.5%
L8786
14.2%
v8786
14.2%
l8786
14.2%
8786
14.2%
14632
 
7.5%
22787
 
4.5%
31146
 
1.9%
4200
 
0.3%
O58
 
0.1%
Other values (6)195
 
0.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter35376
57.3%
Uppercase Letter8844
 
14.3%
Space Separator8786
 
14.2%
Decimal Number8786
 
14.2%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
14632
52.7%
22787
31.7%
31146
 
13.0%
4200
 
2.3%
618
 
0.2%
72
 
< 0.1%
51
 
< 0.1%
Lowercase Letter
ValueCountFrequency (%)
e17630
49.8%
v8786
24.8%
l8786
24.8%
t58
 
0.2%
h58
 
0.2%
r58
 
0.2%
Uppercase Letter
ValueCountFrequency (%)
L8786
99.3%
O58
 
0.7%
Space Separator
ValueCountFrequency (%)
8786
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin44220
71.6%
Common17572
 
28.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
e17630
39.9%
L8786
19.9%
v8786
19.9%
l8786
19.9%
O58
 
0.1%
t58
 
0.1%
h58
 
0.1%
r58
 
0.1%
Common
ValueCountFrequency (%)
8786
50.0%
14632
26.4%
22787
 
15.9%
31146
 
6.5%
4200
 
1.1%
618
 
0.1%
72
 
< 0.1%
51
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII61792
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e17630
28.5%
L8786
14.2%
v8786
14.2%
l8786
14.2%
8786
14.2%
14632
 
7.5%
22787
 
4.5%
31146
 
1.9%
4200
 
0.3%
O58
 
0.1%
Other values (6)195
 
0.3%

Manager_Current_Designation
Categorical

HIGH CORRELATION
MISSING

Distinct: 5
Distinct (%): 0.1%
Missing: 683
Missing (%): 7.2%
Memory size: 74.6 KiB

Value    Count
Level 2  3208
Level 1  2479
Level 3  2033
Level 4  1031
Level 5  93

Length

Max length: 7
Median length: 7
Mean length: 7
Min length: 7

Characters and Unicode

Total characters: 61908
Distinct characters: 10
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Level 2
2nd row: Level 2
3rd row: Level 1
4th row: Level 3
5th row: Level 1

Common Values

Value      Count  Frequency (%)
Level 2    3208   33.7%
Level 1    2479   26.0%
Level 3    2033   21.3%
Level 4    1031   10.8%
Level 5    93     1.0%
(Missing)  683    7.2%

Length

Histogram of lengths of the category

Pie chart

ValueCountFrequency (%)
level8844
50.0%
23208
 
18.1%
12479
 
14.0%
32033
 
11.5%
41031
 
5.8%
593
 
0.5%

Most occurring characters

ValueCountFrequency (%)
e17688
28.6%
L8844
14.3%
v8844
14.3%
l8844
14.3%
8844
14.3%
23208
 
5.2%
12479
 
4.0%
32033
 
3.3%
41031
 
1.7%
593
 
0.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter35376
57.1%
Uppercase Letter8844
 
14.3%
Space Separator8844
 
14.3%
Decimal Number8844
 
14.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
23208
36.3%
12479
28.0%
32033
23.0%
41031
 
11.7%
593
 
1.1%
Lowercase Letter
ValueCountFrequency (%)
e17688
50.0%
v8844
25.0%
l8844
25.0%
Uppercase Letter
ValueCountFrequency (%)
L8844
100.0%
Space Separator
ValueCountFrequency (%)
8844
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin44220
71.4%
Common17688
 
28.6%

Most frequent character per script

Common
ValueCountFrequency (%)
8844
50.0%
23208
 
18.1%
12479
 
14.0%
32033
 
11.5%
41031
 
5.8%
593
 
0.5%
Latin
ValueCountFrequency (%)
e17688
40.0%
L8844
20.0%
v8844
20.0%
l8844
20.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII61908
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e17688
28.6%
L8844
14.3%
v8844
14.3%
l8844
14.3%
8844
14.3%
23208
 
5.2%
12479
 
4.0%
32033
 
3.3%
41031
 
1.7%
593
 
0.2%

Manager_Grade
Real number (ℝ≥0)

HIGH CORRELATION
MISSING

Distinct: 10
Distinct (%): 0.1%
Missing: 683
Missing (%): 7.2%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.264133876
Minimum: 1
Maximum: 10
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 74.6 KiB

Quantile statistics

Minimum1
5-th percentile2
Q12
median3
Q34
95-th percentile6
Maximum10
Range9
Interquartile range (IQR)2

Descriptive statistics

Standard deviation1.137448815
Coefficient of variation (CV)0.3484688001
Kurtosis1.40160972
Mean3.264133876
Median Absolute Deviation (MAD)1
Skewness0.9952793268
Sum28868
Variance1.293789807
MonotonicityNot monotonic
Histogram with fixed size bins (bins=10)
Value | Count | Frequency (%)
3 | 3207 | 33.7%
2 | 2471 | 25.9%
4 | 2038 | 21.4%
5 | 666 | 7.0%
6 | 406 | 4.3%
7 | 22 | 0.2%
8 | 14 | 0.1%
1 | 8 | 0.1%
9 | 7 | 0.1%
10 | 5 | 0.1%
(Missing) | 683 | 7.2%

Value | Count | Frequency (%)
1 | 8 | 0.1%
2 | 2471 | 25.9%
3 | 3207 | 33.7%
4 | 2038 | 21.4%
5 | 666 | 7.0%
6 | 406 | 4.3%
7 | 22 | 0.2%
8 | 14 | 0.1%
9 | 7 | 0.1%
10 | 5 | 0.1%

Value | Count | Frequency (%)
10 | 5 | 0.1%
9 | 7 | 0.1%
8 | 14 | 0.1%
7 | 22 | 0.2%
6 | 406 | 4.3%
5 | 666 | 7.0%
4 | 2038 | 21.4%
3 | 3207 | 33.7%
2 | 2471 | 25.9%
1 | 8 | 0.1%

Manager_Status
Categorical

HIGH CORRELATION
MISSING

Distinct2
Distinct (%)< 0.1%
Missing683
Missing (%)7.2%
Memory size74.6 KiB
Confirmation
5277 
Probation
3567 

Length

Max length12
Median length12
Mean length10.79002714
Min length9

Characters and Unicode

Total characters95427
Distinct characters11
Distinct categories2
Distinct scripts1
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0
Unique (%)0.0%

Sample

1st rowConfirmation
2nd rowConfirmation
3rd rowConfirmation
4th rowConfirmation
5th rowConfirmation

Common Values

Value | Count | Frequency (%)
Confirmation | 5277 | 55.4%
Probation | 3567 | 37.4%
(Missing) | 683 | 7.2%

Length

Histogram of lengths of the category

Pie chart

Value | Count | Frequency (%)
confirmation | 5277 | 59.7%
probation | 3567 | 40.3%

Most occurring characters

ValueCountFrequency (%)
o17688
18.5%
n14121
14.8%
i14121
14.8%
r8844
9.3%
a8844
9.3%
t8844
9.3%
C5277
 
5.5%
f5277
 
5.5%
m5277
 
5.5%
P3567
 
3.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter86583
90.7%
Uppercase Letter8844
 
9.3%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
o17688
20.4%
n14121
16.3%
i14121
16.3%
r8844
10.2%
a8844
10.2%
t8844
10.2%
f5277
 
6.1%
m5277
 
6.1%
b3567
 
4.1%
Uppercase Letter
ValueCountFrequency (%)
C5277
59.7%
P3567
40.3%

Most occurring scripts

ValueCountFrequency (%)
Latin95427
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
o17688
18.5%
n14121
14.8%
i14121
14.8%
r8844
9.3%
a8844
9.3%
t8844
9.3%
C5277
 
5.5%
f5277
 
5.5%
m5277
 
5.5%
P3567
 
3.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII95427
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
o17688
18.5%
n14121
14.8%
i14121
14.8%
r8844
9.3%
a8844
9.3%
t8844
9.3%
C5277
 
5.5%
f5277
 
5.5%
m5277
 
5.5%
P3567
 
3.7%

Manager_Gender
Categorical

MISSING

Distinct2
Distinct (%)< 0.1%
Missing683
Missing (%)7.2%
Memory size74.6 KiB
M
7627 
F
1217 

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters8844
Distinct characters2
Distinct categories1
Distinct scripts1
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0
Unique (%)0.0%

Sample

1st rowM
2nd rowM
3rd rowM
4th rowF
5th rowM

Common Values

Value | Count | Frequency (%)
M | 7627 | 80.1%
F | 1217 | 12.8%
(Missing) | 683 | 7.2%

Length

Histogram of lengths of the category

Pie chart

Value | Count | Frequency (%)
m | 7627 | 86.2%
f | 1217 | 13.8%

Most occurring characters

ValueCountFrequency (%)
M7627
86.2%
F1217
 
13.8%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter8844
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
M7627
86.2%
F1217
 
13.8%

Most occurring scripts

ValueCountFrequency (%)
Latin8844
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
M7627
86.2%
F1217
 
13.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII8844
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
M7627
86.2%
F1217
 
13.8%

Manager_DoB
Categorical

HIGH CARDINALITY
MISSING

Distinct1245
Distinct (%)14.1%
Missing683
Missing (%)7.2%
Memory size74.6 KiB
2/11/1961 | 45
1/7/1976 | 37
5/22/1974 | 30
2/7/1971 | 30
5/27/1955 | 29
Other values (1240) | 8673

Length

Max length10
Median length9
Mean length8.810945274
Min length8

Characters and Unicode

Total characters77924
Distinct characters11
Distinct categories2
Distinct scripts1
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique143
Unique (%)1.6%

Sample

1st row2/17/1978
2nd row2/17/1978
3rd row3/3/1969
4th row8/14/1978
5th row2/7/1971

Common Values

Value | Count | Frequency (%)
2/11/1961 | 45 | 0.5%
1/7/1976 | 37 | 0.4%
5/22/1974 | 30 | 0.3%
2/7/1971 | 30 | 0.3%
5/27/1955 | 29 | 0.3%
7/2/1967 | 28 | 0.3%
5/26/1974 | 28 | 0.3%
10/25/1977 | 27 | 0.3%
5/12/1974 | 27 | 0.3%
9/16/1967 | 26 | 0.3%
Other values (1235) | 8537 | 89.6%
(Missing) | 683 | 7.2%

Length

Histogram of lengths of the category
Value | Count | Frequency (%)
2/11/1961 | 45 | 0.5%
1/7/1976 | 37 | 0.4%
2/7/1971 | 30 | 0.3%
5/22/1974 | 30 | 0.3%
5/27/1955 | 29 | 0.3%
7/2/1967 | 28 | 0.3%
5/26/1974 | 28 | 0.3%
5/12/1974 | 27 | 0.3%
10/25/1977 | 27 | 0.3%
9/16/1967 | 26 | 0.3%
Other values (1235) | 8537 | 96.5%

Most occurring characters

ValueCountFrequency (%)
/17688
22.7%
116941
21.7%
910922
14.0%
78480
10.9%
25441
 
7.0%
64764
 
6.1%
83495
 
4.5%
32955
 
3.8%
52692
 
3.5%
42463
 
3.2%

Most occurring categories

ValueCountFrequency (%)
Decimal Number60236
77.3%
Other Punctuation17688
 
22.7%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
116941
28.1%
910922
18.1%
78480
14.1%
25441
 
9.0%
64764
 
7.9%
83495
 
5.8%
32955
 
4.9%
52692
 
4.5%
42463
 
4.1%
02083
 
3.5%
Other Punctuation
ValueCountFrequency (%)
/17688
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common77924
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
/17688
22.7%
116941
21.7%
910922
14.0%
78480
10.9%
25441
 
7.0%
64764
 
6.1%
83495
 
4.5%
32955
 
3.8%
52692
 
3.5%
42463
 
3.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII77924
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
/17688
22.7%
116941
21.7%
910922
14.0%
78480
10.9%
25441
 
7.0%
64764
 
6.1%
83495
 
4.5%
32955
 
3.8%
52692
 
3.5%
42463
 
3.2%

Manager_Num_Application
Real number (ℝ≥0)

HIGH CORRELATION
MISSING
ZEROS

Distinct17
Distinct (%)0.2%
Missing683
Missing (%)7.2%
Infinite0
Infinite (%)0.0%
Mean1.939733152
Minimum0
Maximum22
Zeros2980
Zeros (%)31.3%
Negative0
Negative (%)0.0%
Memory size74.6 KiB

Quantile statistics

Minimum0
5-th percentile0
Q10
median1
Q33
95-th percentile6
Maximum22
Range22
Interquartile range (IQR)3

Descriptive statistics

Standard deviation2.150528806
Coefficient of variation (CV)1.108672501
Kurtosis3.68934091
Mean1.939733152
Median Absolute Deviation (MAD)1
Skewness1.53618874
Sum17155
Variance4.624774146
MonotonicityNot monotonic
Histogram with fixed size bins (bins=17)
Value | Count | Frequency (%)
0 | 2980 | 31.3%
1 | 1677 | 17.6%
2 | 1339 | 14.1%
3 | 1073 | 11.3%
4 | 710 | 7.5%
5 | 458 | 4.8%
6 | 270 | 2.8%
7 | 132 | 1.4%
8 | 84 | 0.9%
9 | 63 | 0.7%
Other values (7) | 58 | 0.6%
(Missing) | 683 | 7.2%

Value | Count | Frequency (%)
0 | 2980 | 31.3%
1 | 1677 | 17.6%
2 | 1339 | 14.1%
3 | 1073 | 11.3%
4 | 710 | 7.5%
5 | 458 | 4.8%
6 | 270 | 2.8%
7 | 132 | 1.4%
8 | 84 | 0.9%
9 | 63 | 0.7%

Value | Count | Frequency (%)
22 | 1 | < 0.1%
16 | 6 | 0.1%
14 | 1 | < 0.1%
13 | 4 | < 0.1%
12 | 5 | 0.1%
11 | 14 | 0.1%
10 | 27 | 0.3%
9 | 63 | 0.7%
8 | 84 | 0.9%
7 | 132 | 1.4%

Manager_Num_Coded
Real number (ℝ≥0)

HIGH CORRELATION
MISSING
ZEROS

Distinct10
Distinct (%)0.1%
Missing683
Missing (%)7.2%
Infinite0
Infinite (%)0.0%
Mean0.7589326097
Minimum0
Maximum9
Zeros5283
Zeros (%)55.5%
Negative0
Negative (%)0.0%
Memory size74.6 KiB

Quantile statistics

Minimum0
5-th percentile0
Q10
median0
Q31
95-th percentile3
Maximum9
Range9
Interquartile range (IQR)1

Descriptive statistics

Standard deviation1.188643743
Coefficient of variation (CV)1.566204598
Kurtosis4.693823949
Mean0.7589326097
Median Absolute Deviation (MAD)0
Skewness1.975780583
Sum6712
Variance1.412873948
MonotonicityNot monotonic
Histogram with fixed size bins (bins=10)
Value | Count | Frequency (%)
0 | 5283 | 55.5%
1 | 1799 | 18.9%
2 | 936 | 9.8%
3 | 471 | 4.9%
4 | 225 | 2.4%
5 | 83 | 0.9%
6 | 30 | 0.3%
7 | 7 | 0.1%
8 | 6 | 0.1%
9 | 4 | < 0.1%
(Missing) | 683 | 7.2%

Value | Count | Frequency (%)
0 | 5283 | 55.5%
1 | 1799 | 18.9%
2 | 936 | 9.8%
3 | 471 | 4.9%
4 | 225 | 2.4%
5 | 83 | 0.9%
6 | 30 | 0.3%
7 | 7 | 0.1%
8 | 6 | 0.1%
9 | 4 | < 0.1%

Value | Count | Frequency (%)
9 | 4 | < 0.1%
8 | 6 | 0.1%
7 | 7 | 0.1%
6 | 30 | 0.3%
5 | 83 | 0.9%
4 | 225 | 2.4%
3 | 471 | 4.9%
2 | 936 | 9.8%
1 | 1799 | 18.9%
0 | 5283 | 55.5%

Manager_Business
Real number (ℝ)

HIGH CORRELATION
MISSING
ZEROS

Distinct3747
Distinct (%)42.4%
Missing683
Missing (%)7.2%
Infinite0
Infinite (%)0.0%
Mean184370.9734
Minimum-265289
Maximum3578265
Zeros2904
Zeros (%)30.5%
Negative14
Negative (%)0.1%
Memory size74.6 KiB

Quantile statistics

Minimum-265289
5-th percentile0
Q10
median102178
Q3247116.5
95-th percentile692513
Maximum3578265
Range3843554
Interquartile range (IQR)247116.5

Descriptive statistics

Standard deviation274716.3231
Coefficient of variation (CV)1.490019378
Kurtosis18.15642379
Mean184370.9734
Median Absolute Deviation (MAD)102178
Skewness3.366590283
Sum1630576889
Variance7.546905817 × 10^10
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
0 | 2904 | 30.5%
20000 | 34 | 0.4%
50000 | 19 | 0.2%
25000 | 14 | 0.1%
200000 | 10 | 0.1%
30000 | 10 | 0.1%
307003 | 10 | 0.1%
302911 | 9 | 0.1%
574520 | 8 | 0.1%
52900 | 8 | 0.1%
Other values (3737) | 5818 | 61.1%
(Missing) | 683 | 7.2%

Value | Count | Frequency (%)
-265289 | 3 | < 0.1%
-250757 | 1 | < 0.1%
-74587 | 4 | < 0.1%
-28419 | 1 | < 0.1%
-25889 | 1 | < 0.1%
-18472 | 1 | < 0.1%
-2408 | 1 | < 0.1%
-212 | 2 | < 0.1%
0 | 2904 | 30.5%
466 | 1 | < 0.1%

Value | Count | Frequency (%)
3578265 | 1 | < 0.1%
3423740 | 1 | < 0.1%
2792396 | 2 | < 0.1%
2599295 | 1 | < 0.1%
2413440 | 2 | < 0.1%
2403457 | 2 | < 0.1%
2306195 | 1 | < 0.1%
2148532 | 1 | < 0.1%
2105092 | 1 | < 0.1%
2068335 | 1 | < 0.1%

Manager_Num_Products
Real number (ℝ≥0)

HIGH CORRELATION
MISSING
ZEROS

Distinct57
Distinct (%)0.6%
Missing683
Missing (%)7.2%
Infinite0
Infinite (%)0.0%
Mean7.152306649
Minimum0
Maximum101
Zeros2909
Zeros (%)30.5%
Negative0
Negative (%)0.0%
Memory size74.6 KiB

Quantile statistics

Minimum0
5-th percentile0
Q10
median5
Q311
95-th percentile23
Maximum101
Range101
Interquartile range (IQR)11

Descriptive statistics

Standard deviation8.439350937
Coefficient of variation (CV)1.179948141
Kurtosis8.452121926
Mean7.152306649
Median Absolute Deviation (MAD)5
Skewness2.053397802
Sum63255
Variance71.22264423
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
0 | 2909 | 30.5%
6 | 432 | 4.5%
5 | 426 | 4.5%
4 | 398 | 4.2%
7 | 384 | 4.0%
8 | 359 | 3.8%
9 | 331 | 3.5%
3 | 313 | 3.3%
11 | 310 | 3.3%
1 | 292 | 3.1%
Other values (47) | 2690 | 28.2%
(Missing) | 683 | 7.2%

Value | Count | Frequency (%)
0 | 2909 | 30.5%
1 | 292 | 3.1%
2 | 288 | 3.0%
3 | 313 | 3.3%
4 | 398 | 4.2%
5 | 426 | 4.5%
6 | 432 | 4.5%
7 | 384 | 4.0%
8 | 359 | 3.8%
9 | 331 | 3.5%

Value | Count | Frequency (%)
101 | 2 | < 0.1%
74 | 2 | < 0.1%
66 | 4 | < 0.1%
61 | 1 | < 0.1%
60 | 5 | 0.1%
59 | 1 | < 0.1%
53 | 2 | < 0.1%
51 | 5 | 0.1%
48 | 1 | < 0.1%
47 | 4 | < 0.1%

Manager_Business2
Real number (ℝ)

HIGH CORRELATION
MISSING
ZEROS

Distinct3743
Distinct (%)42.3%
Missing683
Missing (%)7.2%
Infinite0
Infinite (%)0.0%
Mean182926.344
Minimum-265289
Maximum3578265
Zeros2909
Zeros (%)30.5%
Negative14
Negative (%)0.1%
Memory size74.6 KiB

Quantile statistics

Minimum-265289
5-th percentile0
Q10
median101714
Q3246461.25
95-th percentile676045.85
Maximum3578265
Range3843554
Interquartile range (IQR)246461.25

Descriptive statistics

Standard deviation271802.1459
Coefficient of variation (CV)1.485855673
Kurtosis18.58394594
Mean182926.344
Median Absolute Deviation (MAD)101714
Skewness3.382484489
Sum1617800586
Variance7.387640651 × 10^10
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
0 | 2909 | 30.5%
20000 | 34 | 0.4%
50000 | 19 | 0.2%
25000 | 14 | 0.1%
30000 | 10 | 0.1%
200000 | 10 | 0.1%
307003 | 10 | 0.1%
302911 | 9 | 0.1%
714000 | 8 | 0.1%
11640 | 8 | 0.1%
Other values (3733) | 5813 | 61.0%
(Missing) | 683 | 7.2%

Value | Count | Frequency (%)
-265289 | 3 | < 0.1%
-250757 | 1 | < 0.1%
-74587 | 4 | < 0.1%
-28419 | 1 | < 0.1%
-25889 | 1 | < 0.1%
-18472 | 1 | < 0.1%
-2408 | 1 | < 0.1%
-212 | 2 | < 0.1%
0 | 2909 | 30.5%
466 | 1 | < 0.1%

Value | Count | Frequency (%)
3578265 | 1 | < 0.1%
3423740 | 1 | < 0.1%
2792396 | 2 | < 0.1%
2599295 | 1 | < 0.1%
2413440 | 2 | < 0.1%
2403457 | 2 | < 0.1%
2306195 | 1 | < 0.1%
2148532 | 1 | < 0.1%
2105092 | 1 | < 0.1%
2068335 | 1 | < 0.1%

Manager_Num_Products2
Real number (ℝ≥0)

HIGH CORRELATION
MISSING
ZEROS

Distinct57
Distinct (%)0.6%
Missing683
Missing (%)7.2%
Infinite0
Infinite (%)0.0%
Mean7.131275441
Minimum0
Maximum101
Zeros2914
Zeros (%)30.6%
Negative0
Negative (%)0.0%
Memory size74.6 KiB

Quantile statistics

Minimum0
5-th percentile0
Q10
median5
Q311
95-th percentile23
Maximum101
Range101
Interquartile range (IQR)11

Descriptive statistics

Standard deviation8.42359672
Coefficient of variation (CV)1.181218814
Kurtosis8.552072166
Mean7.131275441
Median Absolute Deviation (MAD)5
Skewness2.06428762
Sum63069
Variance70.9569817
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
0 | 2914 | 30.6%
6 | 435 | 4.6%
5 | 424 | 4.5%
4 | 405 | 4.3%
7 | 385 | 4.0%
8 | 355 | 3.7%
9 | 331 | 3.5%
3 | 314 | 3.3%
11 | 311 | 3.3%
1 | 290 | 3.0%
Other values (47) | 2680 | 28.1%
(Missing) | 683 | 7.2%

Value | Count | Frequency (%)
0 | 2914 | 30.6%
1 | 290 | 3.0%
2 | 287 | 3.0%
3 | 314 | 3.3%
4 | 405 | 4.3%
5 | 424 | 4.5%
6 | 435 | 4.6%
7 | 385 | 4.0%
8 | 355 | 3.7%
9 | 331 | 3.5%

Value | Count | Frequency (%)
101 | 2 | < 0.1%
74 | 2 | < 0.1%
66 | 4 | < 0.1%
61 | 1 | < 0.1%
60 | 5 | 0.1%
59 | 1 | < 0.1%
53 | 2 | < 0.1%
51 | 5 | 0.1%
48 | 1 | < 0.1%
47 | 4 | < 0.1%

Business_Sourced
Categorical

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size74.6 KiB
0
6260 
1
3267 

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters9527
Distinct characters2
Distinct categories1
Distinct scripts1
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0
Unique (%)0.0%

Sample

1st row0
2nd row1
3rd row0
4th row0
5th row0

Common Values

Value | Count | Frequency (%)
0 | 6260 | 65.7%
1 | 3267 | 34.3%

Length

Histogram of lengths of the category

Pie chart

Value | Count | Frequency (%)
0 | 6260 | 65.7%
1 | 3267 | 34.3%

Most occurring characters

ValueCountFrequency (%)
06260
65.7%
13267
34.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number9527
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
06260
65.7%
13267
34.3%

Most occurring scripts

ValueCountFrequency (%)
Common9527
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
06260
65.7%
13267
34.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII9527
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
06260
65.7%
13267
34.3%

Interactions

[Pairwise interaction plots of the numeric variables; images not reproduced here.]

Correlations


Pearson's r

Pearson's correlation coefficient (r) measures the linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, which implies that, for a linear relationship, the steepness of the line does not affect the magnitude of r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
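
The calculation described above can be sketched in a few lines of plain Python. This is a minimal illustration, not the code the report generator runs; the function name `pearson_r` and the sample values are assumptions made for the example and are not taken from the dataset.

```python
import math


def pearson_r(x, y):
    """Pearson's r: covariance of x and y over the product of their standard deviations."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length, with at least two points")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # The 1/n factors of the covariance and of the two standard deviations
    # cancel, so plain sums of deviations are enough.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("r is undefined when either variable is constant")
    return cov / math.sqrt(ss_x * ss_y)


# Illustrative values only (hypothetical, not rows of the profiled dataset).
applications = [0, 1, 2, 3, 5]
coded = [0, 0, 1, 1, 4]
r = pearson_r(applications, coded)

# Shifting and rescaling one variable leaves r unchanged.
r_rescaled = pearson_r(applications, [10 * c + 3 for c in coded])
```

Comparing `r` with `r_rescaled` shows the location and scale invariance mentioned above: both calls return the same value up to floating-point rounding.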

Spearman's ρ

Spearman's rank correlation coefficient (ρ) measures the monotonic correlation between two variables and is therefore better than Pearson's r at catching nonlinear monotonic correlations. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
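
As a rough sketch of that recipe, the snippet below ranks each variable and then applies the Pearson formula to the ranks. Giving tied values the average of their positions is an assumption of this example (a common convention), and the helper names and sample values are hypothetical.

```python
import math


def average_ranks(values):
    """Rank values from 1 to n; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value equals the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared_rank = (start + end) / 2 + 1  # positions are 0-based, ranks 1-based
        for pos in range(start, end + 1):
            ranks[order[pos]] = shared_rank
        start = end + 1
    return ranks


def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(ss_x * ss_y)


def spearman_rho(x, y):
    """Spearman's rho: Pearson's r computed on the rank variables."""
    return pearson_r(average_ranks(x), average_ranks(y))


# Illustrative values only: y grows monotonically but not linearly with x.
x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 25]
rho = spearman_rho(x, y)
r = pearson_r(x, y)
```

Here `rho` reaches 1 because the relationship is perfectly monotonic, while `r` stays below 1 because it is not linear.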

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation.

To calculate τ for two variables X and Y, one counts the concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
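
The pair-counting definition above corresponds to the variant usually called tau-a. The sketch below implements exactly that definition, assuming tied pairs count as neither concordant nor discordant; statistical libraries often apply a tie correction (tau-b), so their results can differ on data with many ties. The function name and inputs are hypothetical.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length, with at least two points")
    concordant = 0
    discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        direction = (x2 - x1) * (y2 - y1)
        if direction > 0:
            concordant += 1  # both variables move in the same direction
        elif direction < 0:
            discordant += 1  # the variables move in opposite directions
        # direction == 0 means a tie in x or y; it counts as neither
    total_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / total_pairs


# Illustrative values only: two concordant pairs and one discordant pair.
tau = kendall_tau_a([1, 2, 3], [1, 3, 2])
```

With three observations there are three pairs, so the example evaluates to (2 - 1) / 3.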

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available from the Phik project.

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use the bias-corrected measure proposed by Bergsma in 2013.
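
A minimal sketch of the bias-corrected estimator follows, using the commonly cited form of Bergsma's correction: subtract (k-1)(r-1)/(n-1) from φ² and shrink the row and column counts before normalising. This illustrates the formula rather than the report generator's own code; the function name and the toy labels are assumptions.

```python
import math
from collections import Counter


def cramers_v_corrected(x, y):
    """Bias-corrected Cramér's V for two equally long sequences of category labels."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length, with at least two points")
    joint = Counter(zip(x, y))
    row_totals = Counter(x)
    col_totals = Counter(y)
    r = len(row_totals)
    k = len(col_totals)

    # Pearson chi-squared statistic of the contingency table.
    chi2 = 0.0
    for a, row_n in row_totals.items():
        for b, col_n in col_totals.items():
            expected = row_n * col_n / n
            observed = joint.get((a, b), 0)
            chi2 += (observed - expected) ** 2 / expected

    phi2 = chi2 / n
    # Bias correction: shrink phi^2 and the effective table dimensions.
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    denom = min(r_corr - 1, k_corr - 1)
    if denom <= 0:
        return 0.0  # a single-category variable carries no association
    return math.sqrt(phi2_corr / denom)


# Illustrative labels only (hypothetical, not rows of the profiled dataset).
status = ["Confirmation", "Probation"] * 50
v_same = cramers_v_corrected(status, status)  # a variable against itself
gender = ["M", "M", "F", "F"] * 25
flag = ["0", "1", "0", "1"] * 25
v_indep = cramers_v_corrected(gender, flag)  # balanced, unrelated labels
```

In the example, a variable compared with itself gives a value of 1 up to rounding, and the balanced, unrelated pair gives 0.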

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
The dendrogram allows you to more fully correlate variable completion, revealing trends deeper than the pairwise ones visible in the correlation heatmap.
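
One way to read the nullity correlation is as the Pearson correlation between the missing/present indicators of two columns. The sketch below assumes that interpretation and uses hypothetical toy columns; it is not the plotting library's implementation. In this report each Manager_* variable has 683 missing values, and the last rows of the sample show them missing together, which is the kind of pattern a nullity correlation near 1 would reflect.

```python
import math


def is_missing(value):
    """Treat None and float NaN as missing."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def nullity_correlation(col_a, col_b):
    """Pearson correlation of the 0/1 missingness indicators of two columns.

    Returns None when either column is fully present or fully missing,
    because its indicator is constant and the correlation is undefined.
    """
    miss_a = [1.0 if is_missing(v) else 0.0 for v in col_a]
    miss_b = [1.0 if is_missing(v) else 0.0 for v in col_b]
    n = len(miss_a)
    mean_a = sum(miss_a) / n
    mean_b = sum(miss_b) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(miss_a, miss_b))
    ss_a = sum((a - mean_a) ** 2 for a in miss_a)
    ss_b = sum((b - mean_b) ** 2 for b in miss_b)
    if ss_a == 0 or ss_b == 0:
        return None
    return cov / math.sqrt(ss_a * ss_b)


# Hypothetical toy columns: grade and status are missing in the same rows.
grade = [3.0, None, 2.0, float("nan"), 4.0]
status = ["Confirmation", None, "Confirmation", None, "Probation"]
occupation = ["Others", "Business", None, None, "Salaried"]
sourced = [0, 1, 0, 0, 1]

together = nullity_correlation(grade, status)     # missing in the same rows
partial = nullity_correlation(grade, occupation)  # only partly aligned
complete = nullity_correlation(grade, sourced)    # sourced has no gaps
```

Columns without any missing values, such as `sourced` here, produce no coefficient at all, which is why fully populated variables typically do not appear in a nullity heatmap.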

Sample

First rows

(index) | ID | Office_PIN | Application_Receipt_Date | Applicant_City_PIN | Applicant_Gender | Applicant_BirthDate | Applicant_Marital_Status | Applicant_Occupation | Applicant_Qualification | Manager_DOJ | Manager_Joining_Designation | Manager_Current_Designation | Manager_Grade | Manager_Status | Manager_Gender | Manager_DoB | Manager_Num_Application | Manager_Num_Coded | Manager_Business | Manager_Num_Products | Manager_Business2 | Manager_Num_Products2 | Business_Sourced
0 | FIN1000001 | 842001 | 4/16/2007 | 844120.0 | M | 12/19/1971 | M | Others | Graduate | 11/10/2005 | Level 1 | Level 2 | 3.0 | Confirmation | M | 2/17/1978 | 2.0 | 1.0 | 335249.0 | 28.0 | 335249.0 | 28.0 | 0
1 | FIN1000002 | 842001 | 4/16/2007 | 844111.0 | M | 2/17/1983 | S | Others | Class XII | 11/10/2005 | Level 1 | Level 2 | 3.0 | Confirmation | M | 2/17/1978 | 2.0 | 1.0 | 335249.0 | 28.0 | 335249.0 | 28.0 | 1
2 | FIN1000003 | 800001 | 4/16/2007 | 844101.0 | M | 1/16/1966 | M | Business | Class XII | 5/27/2006 | Level 1 | Level 1 | 2.0 | Confirmation | M | 3/3/1969 | 0.0 | 0.0 | 357184.0 | 24.0 | 357184.0 | 24.0 | 0
3 | FIN1000004 | 814112 | 4/16/2007 | 814112.0 | M | 2/3/1988 | S | Salaried | Class XII | 8/21/2003 | Level 1 | Level 3 | 4.0 | Confirmation | F | 8/14/1978 | 0.0 | 0.0 | 318356.0 | 22.0 | 318356.0 | 22.0 | 0
4 | FIN1000005 | 814112 | 4/16/2007 | 815351.0 | M | 7/4/1985 | M | Others | Class XII | 5/8/2006 | Level 1 | Level 1 | 2.0 | Confirmation | M | 2/7/1971 | 2.0 | 1.0 | 230402.0 | 17.0 | 230402.0 | 17.0 | 0
5 | FIN1000006 | 814112 | 4/16/2007 | 814114.0 | M | 3/23/1988 | S | Others | Class XII | 1/17/2006 | Level 1 | Level 1 | 2.0 | Confirmation | M | 2/20/1979 | 0.0 | 0.0 | 247118.0 | 24.0 | 247118.0 | 24.0 | 1
6 | FIN1000007 | 842001 | 4/16/2007 | 844118.0 | M | 2/5/1969 | M | Business | Class XII | 9/1/2003 | Level 1 | Level 1 | 2.0 | Confirmation | M | 5/28/1969 | 0.0 | 0.0 | 315119.0 | 27.0 | 315119.0 | 27.0 | 1
7 | FIN1000008 | 800001 | 4/16/2007 | 844103.0 | M | 1/28/1984 | M | Salaried | Class XII | 12/16/2006 | Level 1 | Level 1 | 2.0 | Confirmation | M | 1/7/1976 | 5.0 | 4.0 | 117358.0 | 9.0 | 117358.0 | 9.0 | 0
8 | FIN1000009 | 209625 | 4/16/2007 | 206451.0 | M | 1/8/1976 | M | Business | Graduate | 11/18/2004 | Level 1 | Level 2 | 3.0 | Confirmation | M | 3/7/1966 | 0.0 | 0.0 | 244028.0 | 17.0 | 244028.0 | 17.0 | 1
9 | FIN1000010 | 211001 | 4/16/2007 | 212218.0 | M | 2/3/1982 | M | Others | Class XII | 8/15/2002 | Level 1 | Level 3 | 4.0 | Confirmation | M | 11/14/1974 | 0.0 | 0.0 | 851557.0 | 39.0 | 851557.0 | 39.0 | 1

Last rows

(index) | ID | Office_PIN | Application_Receipt_Date | Applicant_City_PIN | Applicant_Gender | Applicant_BirthDate | Applicant_Marital_Status | Applicant_Occupation | Applicant_Qualification | Manager_DOJ | Manager_Joining_Designation | Manager_Current_Designation | Manager_Grade | Manager_Status | Manager_Gender | Manager_DoB | Manager_Num_Application | Manager_Num_Coded | Manager_Business | Manager_Num_Products | Manager_Business2 | Manager_Num_Products2 | Business_Sourced
9517 | FIN1009518 | 121002 | 7/1/2008 | 121005.0 | M | 11/4/1980 | S | NaN | Graduate | 6/2/2008 | Level 3 | Level 3 | 4.0 | Probation | M | 10/17/1973 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0
9518 | FIN1009519 | 121002 | 7/1/2008 | 121004.0 | M | 7/5/1981 | M | NaN | Graduate | 6/2/2008 | Level 3 | Level 3 | 4.0 | Probation | M | 10/17/1973 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0
9519 | FIN1009520 | 250001 | 7/1/2008 | 250004.0 | F | 12/3/1965 | M | NaN | Graduate | 6/28/2006 | Level 1 | Level 2 | 3.0 | Confirmation | M | 6/3/1974 | 1.0 | 1.0 | 55000.0 | 2.0 | 55000.0 | 2.0 | 0
9520 | FIN1009521 | 753012 | 7/1/2008 | 754031.0 | M | 4/26/1984 | S | NaN | Class XII | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 0
9521 | FIN1009522 | 814112 | 7/1/2008 | 816118.0 | M | 11/20/1969 | M | NaN | Class XII | 10/3/2006 | Level 1 | Level 1 | 2.0 | Confirmation | M | 9/26/1955 | 4.0 | 2.0 | 418339.0 | 13.0 | 418339.0 | 13.0 | 0
9522 | FIN1009523 | 160017 | 7/1/2008 | 160032.0 | M | 1/18/1970 | M | Salaried | Graduate | 5/5/2008 | Level 2 | Level 2 | 3.0 | Probation | M | 5/10/1967 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0
9523 | FIN1009524 | 848101 | 7/1/2008 | 848302.0 | M | 9/11/1956 | M | NaN | Graduate | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 0
9524 | FIN1009525 | 753012 | 7/1/2008 | 753014.0 | F | 8/7/1975 | M | Salaried | Graduate | 8/22/2006 | Level 2 | Level 2 | 3.0 | Confirmation | M | 7/22/1970 | 0.0 | 0.0 | 316126.0 | 9.0 | 305775.0 | 8.0 | 0
9525 | FIN1009526 | 575003 | 7/1/2008 | 571248.0 | M | 12/23/1986 | S | Salaried | Class XII | 6/5/2008 | Level 3 | Level 3 | 4.0 | Probation | M | 9/23/1976 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0
9526 | FIN1009527 | 411006 | 7/1/2008 | 411006.0 | F | 2/7/1976 | M | Others | Graduate | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 0